ABSTRACT

Humans, like almost all animals, are phase-locked to the diurnal cycle. Most of us sleep at
night and are active through the day. Because we have evolved to function with this cycle, the
circadian rhythm is deeply ingrained and even detectable at the biochemical level. However, within
the broader day-night pattern, there are individual differences; e.g., some of us are intrinsically 
morning-active, while others prefer evenings. In this article, we look at digital daily cycles: 
circadian patterns of activity viewed through the lens of auto-recorded data of communication and 
online activity. We begin at the aggregate level, discuss earlier results, and illustrate differences 
between population-level daily rhythms in different media. Then we move on to the individual level, 
and show that there is a strong individual-level variation beyond averages: individuals typically 
have their distinctive daily pattern that persists in time. We conclude by discussing the driving 
forces behind these signature daily patterns, from personal traits (morningness/eveningness) to 
variation in activity level and external constraints, and outline possibilities for future research. 
1 INTRODUCTION


Almost all life on Earth is affected by the planet’s 24-hour period of rotation. Humans are no different; the 
rhythms of our lives are phase-locked with the diurnal cycle. Because our bodies have evolved to cope with 
the external environment, we have genetic circadian pacemaker circuits that intrinsically follow a period of 
approximately 24 hours (the circadian period length varies from one person to another and with age,
and there are known gender differences (Schmidt et al., 2012; Duffy et al., 2011)). The operation of these
circadian circuits manifests at various levels: biochemical, physiological, psychological, and in various
markers from hormone levels to body temperature (Kerkhof, 1985; Czeisler et al., 1999; Panda et al., 2002;
Baehr et al., 2000). While our daily rhythms can be modulated by exogenous factors (e.g. decoupling
alertness from the sleep/wake cycle (Folkard et al., 1985)), there is a very strong endogenous component in
these rhythms, as indicated by the persistence of a near-24-hour rhythm in the absence of environmental
cues or despite imposition of a non-24-hour schedule (Kleitman, 1963; Wever, 1979).


Within this broader pattern, however, there are substantial inter-individual differences. Such differences 
are apparent in the existence of chronotypes - morning types and evening types, those who go to bed early 
and those who find it difficult to wake up early. The traits of morningness and eveningness correlate with
distinctive temporal patterns of physiological and psychological variables, such as body temperature and
efficiency. They also appear to be linked to gender as well as personality traits; in particular, studies have
shown weak negative correlations of morningness with extraversion and sociability (Tsaousis, 2010; Keren
et al., 2010).


The daily rhythms that humans follow are visible in the digital records that are left in the wake of human
online activity. Population-level and system-level daily rhythms can be observed in the time variation of activity
in YouTube, Twitter and Slashdot, and in the frequency of edits in Wikipedia and OpenStreetMap (Gill et al.,
2007; Kaltenbrunner et al., 2008; Yasseri et al., 2012, 2013). They are also seen in the frequency of mobile
telephone calls (Jo et al., 2012; Krings et al., 2012), and in traces of human mobility derived from mobile
phone data (Song et al., 2010; Ahas et al., 2010; Louail et al., 2014). But what do the circadian patterns
displayed by activity levels in an online system actually reveal about human behaviour? The behaviour
of an online system is determined by a number of factors: the day/night cycle, the function and purpose
of the system in question (e.g. work-related emails mostly being sent during office hours, see below), the
variation of behaviours of user groups (e.g. Wikipedia edits from multiple time zones), and, importantly,
variation at the individual level.

In this paper, we discuss findings regarding the daily patterns in electronic records of human
communication, along with results of analyses that illustrate such patterns in four different datasets. We start at
the aggregate level, studying system-level average patterns and discussing the origins of the findings. From
the system level, we move on to the level of individuals, and focus on the variation that remains
hidden within system-level averages: individual differences reflected in persistent, distinct daily activity
patterns. This part confirms that earlier findings of persistent individual differences in a mobile telephone
dataset (Aledavood et al., 2015a) are general, and that persistent, distinct daily patterns of individuals
are common to different communication channels. We conclude by discussing the implications of these
findings, and address future research questions, from large-scale analysis of the sleep habits of individuals with
big data to daily activity patterns as part of digital phenotypes.


2 DAILY PATTERNS AT THE AGGREGATE LEVEL 
2.1 Previous work 

Let us begin by discussing observations of digital daily cycles in different systems at the aggregate
level, computed from digital records of communication and online activity. In every instance where the
temporal variation of the activity levels in such systems is monitored, the result is a periodic pattern of
activity on several time scales (Saramäki and Moro, 2015). The longest scale is that of a calendar
year, where special periods such as holidays can typically be distinguished (see, e.g., Krings et al. (2012)).
Then there is a weekly cycle, where weekends typically differ from weekdays, and where there can be
differences between weekdays as well (Gill et al., 2007; Kaltenbrunner et al., 2008; Krings et al., 2012;
Yasseri et al., 2012; Vajna, 2012). Finally, there is a daily pattern which may differ significantly between
different systems.

We stress that any observed system-level pattern arises from the superposition of a multitude of individual
patterns, and attributing system-level behaviour to individuals would amount to an ecological fallacy.
Therefore, interpreting what the system-level patterns represent remains a non-trivial task. Solving the
problem of disentangling the superposition of daily patterns, however, may provide important information
about the user population. A good example of this is Yasseri et al. (2012), where the authors studied Wikipedia
in various languages, and were able to infer the geographical spread of their editor base from the assumption
that the observed edit frequency cycles are a superposition of circadian patterns in different time zones.
The method is based on the argument that Wikipedias in different languages exhibit universal daily patterns,
with minima and maxima at around the same time of the day (when correcting for time zones).


Temporal patterns of activity have been studied for different online platforms. For example, in Yasseri
et al. (2013), the authors look at differences between editing patterns on OpenStreetMap, which is a
geo-wiki, for two different cities (London and Rome). Circadian patterns of edits for the two cities are
compared to each other and to that of Wikipedia edits. The authors also followed changes in the
circadian rhythms for each of the two cities over several years. In ten Thij et al. (2014), daily and weekly
patterns of Twitter activity in different languages are studied, and circadian patterns are shown to
emerge for tweets in all the studied languages. In Noulas et al. (2011), the authors looked at
data from Foursquare and found geo-temporal rhythms in activity both for weekdays and weekends.

Analysis of aggregate-level daily cycles with geospatial information has been used in the context of cities
and transport. As an example, in Toole et al. (2012), the authors infer the dynamic land use of different parts of
a city based on temporal patterns of mobile phone activity in different locations. In Ahas et al. (2010),
temporal data is combined with location data from mobile phones. Comparing daily rhythms for different
days of the week, the authors show a significant difference in the mobility of suburban commuters in the city of
Tallinn on weekends as compared to work days. In Louail et al. (2014), the authors investigate the daily
rhythms of different Spanish cities in terms of spatiotemporal patterns of mobile phone usage, and show
how the structure of hotspots, places of frequent usage, allows them to distinguish between different cities.
In Dong et al. (2015), Call Detail Record (CDR) data for a period of 5 months from Côte d’Ivoire is used
to detect unusual crowd events and gatherings.

As a more applied and unconventional example of the analysis of daily rhythms, in May 2014 a number
of different news outlets (e.g., Riley (2014)) described how an elaborate campaign run by Iranian hackers
on social media, targeting American officials and figures, was revealed only after analysing the temporal
patterns of three years of activity. The daily and weekly activity patterns of the hackers matched precisely
the activity profile of Tehran (i.e. low activity at lunch hours of Tehran local time, and little or no activity
on Thursdays and Fridays, which are weekend days in Iran).

Finally, let us mention that electronic records contain evidence of daily/weekly patterns that go beyond
activity rates. Using network analysis, Krings et al. (2012) show that when mobile telephone calls between
individuals are aggregated to form networks, the structural features of those networks differ depending on
the starting time of the aggregation process. In particular, weekends differ from weekdays. A probable
explanation is that during weekends, communication is mainly targeted to close friends and relatives
who reside within the dense core of one’s egocentric network. At a smaller scale, in Aledavood et al.
(2015a), the authors show that closest friends are frequently called in the evenings.


2.2 Results 


In this work, we study three different datasets, one with calls, one with calls and text messages, and one
containing email records (Eckmann et al., 2004). For calls, we use the Reality Mining dataset (Eagle and
Pentland, 2006) and another mobile phone dataset containing data from a small town in a European country
with a population of around 8000 people (a subset of the data used in, e.g., Karsai et al. (2011)). For the
latter, we also study text messages; treating its calls and text messages separately gives four sets in total.
For all sets, we use 8-week slices. A summary of the different sets can be
found in Table 1. Preprocessing of the data is discussed in Methods.
Table 1. Overview of the datasets used in this study.

Dataset               Participants   Active Users   Total Events
Reality Mining Call             87             47         14,187
Town Call                     1204            277         45,844
Town Text                      708             64         13,014
Email                         2430            431        206,723



Figure 1. Number of events per hour for each day of the week in our datasets, as a function of time (hours). Each curve is aggregated over the entire 8-week period. From top to bottom:
calls in Reality Mining, calls and texts in the small town, and emails. We observe strong diurnal patterns in all datasets; for the small town datasets there are also
differences between call and text activity. The email dataset shows decreased activity during weekends.


As the first step, we look at aggregated hourly event frequencies for each of the four different sets (Fig. 1).
It is clear that while the sleep/wake cycle is apparent in each set, there are also noticeable differences. Calls
in the European town show a double-peaked daily curve, whereas the Reality Mining data displays no
such pattern. It is possible that this is due to different conventions; students in Boston can be expected to
behave differently than people in a small European town. Note that for the Reality Mining data, time zone
information is not available, so we have manually shifted the timestamps such that the lowest points correspond to
night; this estimate may be inaccurate. However, it only affects the phase of the
pattern, not its shape.
Figure 2. The daily pattern in each of the datasets, computed as an average over all Wednesdays in the data. Colours are the same as in Fig. 1. We observe
distinct patterns across the various data channels. Email activity is early in the day, whereas (unobtrusive) text messages peak late at night.


Interestingly, in both call datasets, the highest peak occurs on the fifth day (Friday). Also note the very
low activity level during the weekend in the email data. For email, time stamps are relative to some
unknown reference time t0, so the daily cycles appear shifted compared to the other datasets.


In Fig. 2, we focus on the differences between the daily cycles of the various datasets. Here, we plot the average
daily patterns in each system on the third day of the week. Since there is no exact time zone information
for the Reality Mining and email datasets, we identified the third day of the week by assuming that the two
low-activity days correspond to the weekend. We also aligned the timelines by assuming that the lowest
activity of the day occurs at 4 AM for all datasets. We then average over the third-day patterns across all
eight weeks in each set. As in Aledavood et al. (2015b), we find differences between the communication
channels: for the small town dataset, the peak of text messages is later than that of calls. This is perhaps
due to the different nature of these channels; while getting calls in the late hours might not be appreciated,
receiving text messages, which are much less obtrusive, is still acceptable.
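
The alignment of timelines to a 4 AM minimum can be illustrated with a short sketch. It assumes that a daily pattern is stored as 24 hourly counts; the function name `align_to_trough` and the use of a circular shift are illustrative assumptions and not necessarily the procedure used to produce Fig. 2.

```python
import numpy as np

def align_to_trough(hourly_counts, trough_hour=4):
    """Circularly shift a 24-bin daily pattern so that its minimum
    falls at `trough_hour` (hypothetical helper, 4 AM by default)."""
    counts = np.asarray(hourly_counts, dtype=float)
    if counts.shape != (24,):
        raise ValueError("expected 24 hourly bins")
    # np.roll moves the element at index i to index i + shift (mod 24),
    # so this shift carries the minimum to trough_hour.
    shift = trough_hour - int(np.argmin(counts))
    return np.roll(counts, shift)
```

Such a shift changes only the phase of a pattern and leaves its shape intact; if several hours tie for the minimum, this sketch uses the first one.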


3 DAILY PATTERNS AT THE LEVEL OF INDIVIDUALS 
3.1 Previous work 


In Aledavood et al. (2015a), two of the present authors investigated individual-level daily cycles in mobile
phone call data from 24 individuals (12 male and 12 female) over 18 months. The data collection was
performed in a setting where the participants completed high school some months after the collection
began, and then started their first year at university, often in another city, or went to work. This design
guaranteed a high turnover in their social networks (Saramäki et al., 2014), and provided an opportunity to
study a major change in their life circumstances. Looking at individual-level daily call patterns, however,
it was clear that there were persistent individual differences; each individual has their distinctive daily
cycle despite social network turnover and changes in circumstances. This observation speaks in favour of
intrinsic factors (such as the aforementioned chronotypes) dominating individual-level variations in daily
patterns (see Discussion).

Figure 3. A sample of 12 individual-level daily patterns for four datasets to illustrate the diverse nature of individual patterns. Columns correspond to datasets,
while rows A)-C) correspond to different individuals (who are not the same across datasets). The black line shows the average daily pattern for the dataset in
question, and is therefore the same in each column, whereas green/red areas denote where this individual’s pattern is above or below average. We observe that
in almost every case, the individual patterns differ strongly from the average behaviour, for example by increased calling frequency during mornings, mid-days,
or evenings.

3.2 Results 

Continuing the analysis of the four datasets, we first calculate for each set the daily patterns for each
individual (“ego”) by counting the total number of events associated with the ego at each hour of the day
through the whole 8 weeks. The counts are then normalised to sum to one for each ego to yield that person’s daily
activity pattern. As a reference, we also compute the average pattern over all egos from the normalised
patterns.
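
As a minimal sketch of this computation, assuming that events are available as (ego, hour-of-day) pairs with integer hours 0 to 23, the daily patterns and their average could be obtained as follows. The function names and the input format are hypothetical and chosen for illustration only.

```python
from collections import defaultdict
import numpy as np

def daily_patterns(events):
    """Return {ego: 24-bin pattern normalised to sum to one}.

    `events` is an iterable of (ego_id, hour) pairs; this input
    format is an assumption made for the sketch.
    """
    counts = defaultdict(lambda: np.zeros(24))
    for ego, hour in events:
        counts[ego][hour] += 1
    # Normalise each ego's hourly counts to a probability distribution.
    return {ego: c / c.sum() for ego, c in counts.items()}

def average_pattern(patterns):
    """Average of the normalised patterns over all egos."""
    return np.mean(list(patterns.values()), axis=0)
```

Averaging the normalised patterns, rather than the raw counts, gives every ego equal weight in the reference pattern regardless of activity level.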

Fig. 3 displays a sample of the individual-level daily patterns for each dataset. For each set, we have
picked three egos to demonstrate individual differences; for each ego, their differences from the aggregated
average are emphasized by red and green colours. For all datasets, we can observe clear variation between
individuals. Considering the differences between the aggregate and individual daily cycles serves two
purposes. While the average pattern in each dataset reveals general underlying mechanisms, the individual
patterns show that each person has their own preferences for the timing of communication with others. The
daily communication cycles point at variation beyond morningness and eveningness: while individuals
clearly have different sleep/wake cycles, they also have their specific patterns during their wakefulness
periods.
Figure 4. Self and reference distances for daily patterns in our datasets. Self-distance measures the distance between one individual’s daily patterns in two
consecutive 4-week intervals, whereas reference distances are computed between all pairs of individuals in a 4-week interval.


Using the same methodology as Aledavood et al. (2015a) to study whether these daily patterns
are persistent and thus characteristic of each individual, we divide the 8 weeks of data
into two 4-week time intervals and use the Jensen-Shannon divergence to measure self and reference
distances between patterns. A detailed explanation of these calculations can be found in the Methods
section. The results are shown in Fig. 4. We observe an effect similar to the findings in Aledavood et al.
(2015a): the daily patterns of individuals tend to be more similar to themselves in consecutive time intervals
than to the daily patterns of other individuals in the same time interval. This indicates that individuals
have distinct daily patterns that retain their shapes in time. In other words, Fig. 4 shows that the individual
differences seen in Fig. 3 are not just caused by random fluctuations: were fluctuations the reason for
individual differences, each individual’s patterns in consecutive intervals would be no more similar to each
other than to those of everyone else. As self-distances are on average lower, this is clearly not the case.


4 DISCUSSION 


Circadian rhythms have deep roots in human physiology, driven by the environment in which we live.
These patterns manifest themselves in different ways at the individual and aggregate levels. There are
diurnal patterns that are only visible at the aggregate level in the overall frequencies of various phenomena
that are rare or one-time events at the individual level: time of birth, heart attacks, suicides or
unethical behaviour (Refinetti, 2005; Ruffieux et al., 1992; Kouchaki and Smith, 2014). In contrast, the
daily rhythms that we have focussed on here originate at the level of individuals, where they manifest as
time-dependent event rates of e.g. digital communication.
What are the factors that determine an individual’s daily rhythm as viewed through the lens of electronic
records? The most obvious one is the sleep/wake cycle: we do not send emails or edit Wikipedia while asleep.
This is known to be the central driver behind individual differences. First, individuals have different intrinsic
chronotypes (morningness/eveningness tendencies (Kerkhof, 1985)). Second, the preferred duration of sleep
also varies from one person to another (Blatter and Cajochen, 2007). Third, besides these intrinsic factors,
external forcing such as different work schedules also has an effect on the sleep/wake cycle (Taillard et al.,
1999).


In addition to differences in the sleep/wake cycle, our alertness and propensity to sleep are distinct for each 
individual and vary throughout the day. Naturally, individuals go on average through fairly similar cycles 
of wakefulness and sleepiness, which may explain the qualitatively similar features of aggregate-level daily 
patterns across different systems. At the level of individuals, however, there are important differences, 
which are reflected in the observed daily patterns in digital records. As an example, a tired person might be 
less likely to write an important email or edit a Wikipedia article. Likewise, in addition to these intrinsic 
alertness cycles, one’s daily schedule (work, commuting, etc.) plays a role by imposing constraints on the 
times when it is possible to send emails or make calls. In terms of daily patterns of telephone calls, things 
are more complicated, because every call involves two individuals—a caller and a recipient. When calling, 
one must consider social norms and the availability of the other party. 


Understanding which of the factors discussed above dominate the digital daily cycles of individuals and 
give rise to individual differences and persistent circadian patterns is a task that requires further attention. 
While the persistence of daily patterns appears to indicate that the intrinsic components (chronotypes,
alertness cycles) do play a major role (Aledavood et al., 2015a), external factors should also be of
importance (see, e.g., Llorente et al. (2014)). Further, it will be necessary to study whether individuals
bound by (strong) social ties tend to synchronise their communication and availability. 

While analysing digital records at the aggregate level can provide us with invaluable population-level insights
and help to replace or improve traditional survey or census methods (Deville et al., 2014; Toole et al., 2012),
studying the temporal fingerprints of individuals will unveil many new opportunities. As smartphones and
other wearable devices are becoming ever more ubiquitous, they also increasingly provide high-velocity,
high-volume data streams describing human behaviour (Torous et al., 2015a). This data-collection capability
makes these devices excellent tools for research, particularly within health, psychology and medicine,
since smartphones allow researchers to study individual behavioural patterns (“digital phenotypes”; Jain
et al., 2015; Onnela, 2015) and their changes over time (Miller, 2012). Monitoring an individual’s digital
behavioural patterns on different timescales is also an easy and inexpensive route to medical intervention,
especially in the case of mental health problems, where there are fewer biomarkers than for other types of disease.
Data from smartphones have already been used to monitor the time evolution of different measures that
are known to be indicative of behavioural changes in patients, which makes daily monitoring and early
intervention possible (Matthews et al., 2014; Torous et al., 2015b; Saeb et al., 2015). As an example,
Faurholt-Jepsen et al. (2014) suggest that data from mobile phones can be used as an objective measure of
symptoms of bipolar disorder.


Because the sleep/wake cycle is a dominant feature of circadian patterns, Big Data describing the digital
daily cycles of large numbers of individuals might prove to be highly useful for sleep research. However,
obtaining an accurate picture of the sleep times of individuals requires solving several non-trivial problems.
While one does not send emails when asleep, emails are not necessarily a reliable proxy for awake-time;
it is possible to be awake and not send emails. In this sense, inferring the actual times of sleep from
electronic records is challenging. This problem is made more severe by the ubiquitous burstiness in human
dynamics (Barabási, 2005; Karsai et al., 2011; Miritello et al., 2011): broadly distributed inter-event times
make the times from last observation to bed time (or from wake-up to first observation) highly unpredictable.
Nevertheless, we believe that this is an important direction for future research.

Finally, a particularly promising source of data comes from large dedicated cell-phone based data collection
efforts, focusing on collecting multiplex (face-to-face, telecommunication, online social networks)
network data in large, densely connected populations, e.g., Stopczynski et al. (2014). Data from a single
communication channel can be too sparse and noisy for obtaining accurate daily patterns; here, having a
multiplex dataset can provide a great advantage since one can combine information from all data channels
to form a much more comprehensive picture of the activity of each person (e.g. for studying sleeping
patterns). Furthermore, if the participants of the dataset are densely connected through social ties, it is also
possible to investigate the significance of and correlations between the activity patterns of close personal
relations using such a dataset. Finally, a dataset of this nature may function as a kind of “Rosetta stone”,
helping researchers determine the biases of each electronic dataset, and allowing us to understand to what
extent telecommunication data or Twitter datasets with hundreds of millions of active users can be used to
study the daily cycles of individuals.


5 METHODS 

5.1 Data filtering 

We have used 8-week time slices of all datasets. Filters have been applied to remove users who are
inactive or whose activity is too low for producing meaningful information on daily patterns. In Table 1, the
total number of participants means the total number of users who have at least one event during the study
period of 8 weeks. For plotting aggregate-level patterns (Fig. 1 and Fig. 2), we have used data from all
participants. The column “Active Users” in the table represents the number of users who have at least one
event per day on average (minimum 56 events in total); these have been used for calculating average daily
patterns (Fig. 3). For measuring persistence of daily patterns and calculating the Jensen-Shannon divergence,
we used the subset of active users who have at least one event in each of the two time intervals of 4 weeks.
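
The three user sets described above can be sketched as follows, assuming that events are given as (ego, timestamp) pairs with comparable timestamps and that `split_time` marks the boundary between the two 4-week intervals. The function name, argument names and input format are hypothetical; only the threshold of 56 events is taken from the text.

```python
from collections import Counter

def select_egos(events, split_time, min_events=56):
    """Return (participants, active, persistent) sets of ego ids.

    participants: at least one event in the 8-week slice
    active:       at least `min_events` events in total
    persistent:   active, with events in both 4-week intervals
    """
    total = Counter()
    first, second = set(), set()
    for ego, t in events:
        total[ego] += 1
        # Assumption: events before split_time belong to interval 1.
        (first if t < split_time else second).add(ego)
    participants = set(total)
    active = {ego for ego, n in total.items() if n >= min_events}
    persistent = active & first & second
    return participants, active, persistent
```

The sets are nested by construction, so the aggregate-level, individual-level and persistence analyses operate on successively smaller groups of users.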

5.2 Self and reference distances 

In order to quantify the level of persistence of daily patterns for individuals, we compare the daily
patterns of each ego for two consecutive 4-week time intervals. For this, we use the Jensen-Shannon
divergence (JSD) and measure the distance of the daily patterns viewed as two probability distributions
($P_1$ and $P_2$). The JSD is calculated as follows: $\mathrm{JSD}(P_1, P_2) = H(\tfrac{1}{2}P_1 + \tfrac{1}{2}P_2) - \tfrac{1}{2}[H(P_1) + H(P_2)]$,
where $P_i = p(h)$ and $p(h)$ is the fraction of calls at each hour $h$, $i = 1, 2$ indicates the time interval, and
$H(P) = -\sum_h p(h) \log p(h)$ is the Shannon entropy. In order to compare these self-distances against a
reference, we calculate a set of reference distances $d_{\mathrm{ref}}$, the distances between the daily patterns of each
ego and all other egos in the same time interval.
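
A minimal sketch of these calculations is given below. It assumes that each daily pattern is a 24-bin distribution summing to one, that the natural logarithm is used (the base is not specified above), and that reference distances are taken over all pairs of egos in the first interval; the function names are illustrative.

```python
import numpy as np

def shannon_entropy(p):
    """H(P) = -sum_h p(h) log p(h), with 0 log 0 taken as 0."""
    p = np.asarray(p, dtype=float)
    nonzero = p > 0
    return -np.sum(p[nonzero] * np.log(p[nonzero]))

def jsd(p1, p2):
    """Jensen-Shannon divergence between two distributions."""
    p1 = np.asarray(p1, dtype=float)
    p2 = np.asarray(p2, dtype=float)
    mixture = 0.5 * (p1 + p2)
    return shannon_entropy(mixture) - 0.5 * (shannon_entropy(p1) + shannon_entropy(p2))

def self_and_reference_distances(first, second):
    """`first` and `second` map ego -> pattern in intervals 1 and 2."""
    egos = sorted(set(first) & set(second))
    # Self-distance: same ego, consecutive intervals.
    d_self = {e: jsd(first[e], second[e]) for e in egos}
    # Reference distances: distinct egos, same interval (assumed: interval 1).
    d_ref = [jsd(first[a], first[b])
             for i, a in enumerate(egos) for b in egos[i + 1:]]
    return d_self, d_ref
```

With the natural logarithm, the divergence ranges from 0 for identical patterns to log 2 for patterns with no overlapping hours, which gives a scale for reading the self and reference distances.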